New algorithm for constructing area-based index with geographical heterogeneities and variable selection: An application to gastric cancer screening
Abstract
To allocate health resources optimally, policy planners need an indicator that reflects health inequality. Currently, health inequalities are frequently measured by area-based indices. However, methodologies for constructing these indices have been hampered by two difficulties: 1) incorporating geographical relationships into the model and 2) selecting appropriate variables from high-dimensional census data. Here, we constructed a new area-based health coverage index using geographical information and a variable selection procedure, taking gastric cancer as the example. We also characterized the geographical distribution of health inequality in Japan. To construct the index, we proposed a geographically weighted logistic lasso model. We adopted a geographical kernel and selected the optimal bandwidth and the regularization parameters with a two-stage algorithm. Sensitivity was checked by correlating the index with several cancer mortality and screening rates. Lastly, we mapped the current distribution of health inequality in Japan and detected unique predictors at sampled locations. The interquartile range of the index was 0.0001 to 0.354 (mean: 0.178, SD: 0.109). The variables selected from 91 candidates in the Japanese census data showed regional heterogeneity (median number of selected variables: 29). Our index was more strongly correlated with cancer mortality and screening rates than the previous index and revealed several geographical clusters with unique predictors.
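The abstract names the model but not its implementation, so the following is only a minimal sketch of what a geographically weighted logistic lasso fit at a single location might look like. It assumes a Gaussian kernel, standardized features, and a plain proximal-gradient (ISTA) solver. The function names, the toy data, and the fixed `bandwidth` and `lam` values are all hypothetical; the two-stage selection of bandwidth and regularization parameters described above is not reproduced here.

```python
import math

def gaussian_kernel(dist, bandwidth):
    """Assumed geographical kernel: weight decays smoothly with distance."""
    return math.exp(-0.5 * (dist / bandwidth) ** 2)

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty; shrinks z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_weighted_logistic_lasso(X, y, weights, lam, step=0.1, n_iter=500):
    """Weighted L1-penalized logistic regression via proximal gradient.

    Returns (intercept, coefficients). The intercept is not penalized.
    Features are assumed to be roughly standardized so the fixed step is safe.
    """
    p = len(X[0])
    total = sum(weights)
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        g0, g = 0.0, [0.0] * p
        for xi, yi, wi in zip(X, y, weights):
            eta = b0 + sum(b * x for b, x in zip(beta, xi))
            r = wi * (sigmoid(eta) - yi) / total  # weighted residual
            g0 += r
            for j in range(p):
                g[j] += r * xi[j]
        b0 -= step * g0
        beta = [soft_threshold(b - step * gj, step * lam)
                for b, gj in zip(beta, g)]
    return b0, beta

def gw_logistic_lasso(coords, X, y, target, bandwidth, lam):
    """Fit a local model at `target`, weighting each area by its distance."""
    weights = [gaussian_kernel(math.dist(c, target), bandwidth) for c in coords]
    return fit_weighted_logistic_lasso(X, y, weights, lam)

# Hypothetical toy data: 20 areas on a line, two standardized covariates.
coords = [(float(i), 0.0) for i in range(20)]
X = [[((i * 7) % 11 - 5) / 5, ((i * 3) % 7 - 3) / 3] for i in range(20)]
y = [1 if row[0] > 0 else 0 for row in X]  # outcome driven by the first covariate

intercept, beta = gw_logistic_lasso(coords, X, y, target=(0.0, 0.0),
                                    bandwidth=5.0, lam=0.01)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("locally selected variables:", selected)
```

Repeating the local fit at each area's own coordinates would give a location-specific set of selected variables, which is one way the regional heterogeneity reported in the abstract could arise.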
Similar resources
Evaluation of caspase 3 and 9 gene polymorphisms in patients with gastric cancer in Mazandaran province: a brief report
Background: Gastric cancer is the most prevalent cancer of the gastrointestinal tract and has poor survival. Caspase 3 and 9 play an important role in the development and progression of cancer. Polymorphisms in the genes for these enzymes can affect gene activity and thus may influence susceptibility to gastric cancer. In this study, caspase 3 and 9 gene polymorphisms in patients with gastric cancer ...
A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)
Machine learning-based classification techniques support the decision-making process in healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous, and their high dimensionality leads to a slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...
Exploring Gene Signatures in Different Molecular Subtypes of Gastric Cancer (MSS/ TP53+, MSS/TP53-): A Network-based and Machine Learning Approach
Gastric cancer (GC) is one of the leading causes of cancer mortality, worldwide. Molecular understanding of GC’s different subtypes is still dismal and it is necessary to develop new subtype-specific diagnostic and therapeutic approaches. Therefore developing comprehensive research in this area is demanding to have a deeper insight into molecular processes, underlying these subtypes. In this st...
SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy
In this paper, we propose a new gene selection algorithm based on Shuffled Frog Leaping Algorithm that is called SFLA-FS. The proposed algorithm is used for improving cancer classification accuracy. Most of the biological datasets such as cancer datasets have a large number of genes and few samples. However, most of these genes are not usable in some tasks for example in cancer classification....
Sustainable Supplier Selection by a New Hybrid Support Vector-model based on the Cuckoo Optimization Algorithm
For assessing and selecting sustainable suppliers, this study considers a triple-bottom-line approach, including profit, people and planet, and regards business operations, environmental effects along with social responsibilities of the suppliers. Diverse metrics are acquainted with measure execution in these three issues. This study builds up a new hybrid intelligent model, namely COA-LS-SVM, ...